
GH-48837: [C++] Remove invalid DCHECK when allow_delayed_open is true#48836

Open
jiaqizho wants to merge 1 commit into apache:main from jiaqizho:fix-fs-invalid-dcheck


@jiaqizho jiaqizho commented Jan 13, 2026


Rationale for this change

Reproduced with a C++ build in DEBUG mode:

std::string test_data(11 * 1024 * 1024, 'A'); // 11 MB, which is larger than kMultiPartUploadThresholdSize (10 MB)
ARROW_ASSIGN_OR_RAISE(auto arrowfs, S3FileSystem::Make(...)); // with allow_delayed_open=true
ARROW_ASSIGN_OR_RAISE(auto output_stream, arrowfs->OpenOutputStream(...));

auto write_status = output_stream->Write(test_data.data(), test_data.size()); // <--- crashes here (core dump)

Core dump backtrace:


    ... 
    frame #7: 0x0000000119029565 arrow::util::ArrowLog::~ArrowLog(this=0x00007ff7bfefb410) at logging.cc:253:23
    frame #8: 0x0000000112fd0591 arrow::fs::ObjectOutputStream::CreateMultipartUpload(this=0x000061300000f018) at s3fs.cc:1652:5
    frame #9: 0x000000011301ceb7 arrow::fs::ObjectOutputStream::UploadPart(this=0x000061300000f018, data=0x0000000108001800, nbytes=10485760, owned_buffer=nullptr) at s3fs.cc:2057:7
    frame #10: 0x000000011301bff4 arrow::fs::ObjectOutputStream::DoWrite(this=0x000061300000f018, data=0x0000000108001800, nbytes=11534336, owned_buffer=nullptr) at s3fs.cc:1864:7
    frame #11: 0x00000001130194aa arrow::fs::ObjectOutputStream::Write(this=0x000061300000f018, data=0x0000000108001800, nbytes=11534336) at s3fs.cc:1828:75
    frame #12: 0x0000000118e862c6 arrow::io::Writable::Write(this=0x000061300000f018, data="AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"...) at interfaces.cc:199:10
    ... 

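The crash comes from the DCHECK(ShouldBeMultipartUpload()) in CreateMultipartUpload(). At that point pos_ is still 0 because nothing has been written yet, and allow_delayed_open_ is true, so ShouldBeMultipartUpload() returns false and the DCHECK fails.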
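
To make the failing condition concrete, here is a minimal standalone sketch. It assumes the predicate has roughly the shape `pos_ > kMultiPartUploadThresholdSize || !allow_delayed_open_`; that shape, the function name, and the constant name below are assumptions for illustration and are not copied from s3fs.cc.

```cpp
#include <cstdint>

// Assumed threshold: the description above mentions a 10 MB
// kMultiPartUploadThresholdSize. The name here is hypothetical.
constexpr int64_t kThresholdSketch = 10 * 1024 * 1024;

// Hypothetical stand-in for ObjectOutputStream::ShouldBeMultipartUpload().
// Assumption: multipart is expected once more than the threshold has been
// written, or whenever delayed open is disabled.
constexpr bool ShouldBeMultipartUploadSketch(int64_t pos, bool allow_delayed_open) {
  return pos > kThresholdSketch || !allow_delayed_open;
}

// State at the time of the crash: nothing written yet, delayed open enabled.
// The predicate is false, so a DCHECK on it would abort in a debug build.
static_assert(!ShouldBeMultipartUploadSketch(0, true),
              "predicate is false on a large first write");

// Without delayed open, the same predicate holds even at position 0.
static_assert(ShouldBeMultipartUploadSketch(0, false),
              "predicate is true when delayed open is off");
```

If the real predicate differs from this assumed shape, the sketch still shows the general problem: any check that depends on pos_ cannot hold before the first write has been accounted for.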

What changes are included in this PR?

Remove invalid DCHECK when allow_delayed_open is true

Are these changes tested?

Yes

Are there any user-facing changes?

Nope


@github-actions

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}


@jiaqizho jiaqizho changed the title [C++][FS] Remove invalid DCHECK when allow_delayed_open is true GH-48837: [C++] Remove invalid DCHECK when allow_delayed_open is true Jan 13, 2026
@github-actions

⚠️ GitHub issue #48837 has been automatically assigned in GitHub to PR creator.

@jiaqizho
Author

@lidavidm @westonpace @wgtmac PTAL


Copilot AI left a comment


Pull request overview

Removes an invalid debug assertion in the S3 output stream multipart upload path when allow_delayed_open is enabled (fixing a DEBUG-mode crash when the first write triggers multipart upload creation).

Changes:

  • Removed DCHECK(ShouldBeMultipartUpload()) from ObjectOutputStream::CreateMultipartUpload() to avoid an incorrect invariant when pos_ == 0 but a multipart upload is still legitimately created (e.g., large first write).


Contributor

@hrishikeshh-shinde hrishikeshh-shinde left a comment


The DCHECK assumes pos_ has already been updated before CreateMultipartUpload() is called, but with allow_delayed_open the first write can trigger a multipart upload while pos_ is still 0. Removing the DCHECK fixes the crash. The MATLAB CI failure looks unrelated to this one-line change.
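
The ordering described here can be sketched with a toy model. Everything below is hypothetical: `ToyStream`, its members, and the 10 MB part size are assumptions for illustration, not the real `ObjectOutputStream`. It only shows how a large first write can reach the multipart-creation step before the position counter advances.

```cpp
#include <cstdint>

// Toy model of an output stream with delayed open (hypothetical names).
struct ToyStream {
  // Assumed part size, matching the 10485760-byte part in the backtrace.
  static constexpr int64_t kPartSize = 10 * 1024 * 1024;

  int64_t pos = 0;                  // bytes accounted for so far
  bool upload_created = false;      // has a multipart upload been started?
  int64_t pos_seen_at_create = -1;  // value of pos when the upload was created

  void CreateMultipartUpload() {
    // Record what pos looked like at creation time. A position-based
    // assertion placed here would observe this value.
    pos_seen_at_create = pos;
    upload_created = true;
  }

  void UploadPart(int64_t /*nbytes*/) {
    // With delayed open, the upload is created lazily on the first part.
    if (!upload_created) CreateMultipartUpload();
  }

  void Write(int64_t nbytes) {
    // A write of at least one full part is sent as a part right away.
    if (nbytes >= kPartSize) UploadPart(kPartSize);
    // Assumption: the position is only advanced after the part is handled.
    pos += nbytes;
  }
};
```

Calling `Write(11 * 1024 * 1024)` on a fresh `ToyStream` leaves `pos_seen_at_create` at 0, which is the state in which a position-based check inside the creation step cannot hold.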

@github-actions github-actions bot added the "awaiting committer review" label and removed the "awaiting review" label Apr 21, 2026
@jiaqizho
Author

> The DCHECK assumes pos_ has already been updated before CreateMultipartUpload() is called, but with allow_delayed_open the first write can trigger a multipart upload while pos_ is still 0. Removing the DCHECK fixes the crash. The MATLAB CI failure looks unrelated to this one-line change.

@hrishikeshh-shinde would you be able to help retrigger the CI? Thanks.

@hrishikeshh-shinde
Contributor

> > The DCHECK assumes pos_ has already been updated before CreateMultipartUpload() is called, but with allow_delayed_open the first write can trigger a multipart upload while pos_ is still 0. Removing the DCHECK fixes the crash. The MATLAB CI failure looks unrelated to this one-line change.
>
> @hrishikeshh-shinde would you be able to help retrigger the CI? Thanks.

I think only members can trigger the CI.
